INTERFACES FOR AN OPEN SYSTEMS
SERVER PROVIDING TAPE DRIVE 
EMULATION 

CROSS-REFERENCE TO RELATED 
APPLICATIONS 

This application claims priority from U.S. Provisional 
Application No. 60/052,055, filed Jul. 9, 1997, which is 
incorporated herein by reference in its entirety for all pur- 
poses. 

BACKGROUND OF THE INVENTION 

The present invention relates generally to data storage 
systems and more particularly relates to tape drive emulation 
(TDE) systems. 

Many data processing systems utilize tape drives for 
storage of data. Channel interfaces and commands generated 
by a data source to control the transfer of data from a host 
computer to a tape drive are well-known in the art and will 
not be described in detail here. One example of an operating 
system for managing data transfer between a host computer 
and a tape drive is the MVS system manufactured by IBM
Corporation. 

In a tape drive emulation system, the data output by the 
host is not actually written to a tape drive and data input by 
the host is not actually read from a tape drive. Instead, in one
type of TDE system, the data is input from and output to 
staging disks. 

SUMMARY OF THE INVENTION 

According to one aspect of the invention, an improved 
interface facilitates communication between a library man- 
agement system (LMS), operating on a host computer, and 
an Open Systems Server (OSS) containing virtual tape 
drives (VTDs) and operating as a TDE system. 

According to another aspect of the invention, the interface 
supports two different kinds of communications. One type is 
a dump of a large amount of data that is not suited to
real-time short transactions. In this case, the data to be 
provided to the LMS is packaged in a virtual volume with a 
special name convention. Such a special virtual volume is 
called an "administrative volume." Periodically, the LMS 
requests an administrative volume, the OSS "mounts" the 
volume on a VTD, and the LMS reads the wanted informa- 
tion from the VTD. Thus, the administrative volume is used 
to communicate status, control, and configuration informa- 
tion between the LMS in the host and OSS, using a standard 
access method (tape or virtual tape). 
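
As a rough illustration of how a mount request can double as a status query, the sketch below classifies a volume serial number using the naming convention given in the detailed description (a reserved five-character prefix plus one type character) and the five administrative volume types listed there. The prefix value `ADMIN`, the suffix letters, and all function and class names are hypothetical; the source does not state them.

```python
from enum import Enum
from typing import Optional

# Hypothetical reserved five-character prefix; the actual value is not given.
ADMIN_PREFIX = "ADMIN"


class AdminVolumeType(Enum):
    """The five administrative volume types named in the description."""
    AUDIT_LIST = "Audit List"
    AUDIT_LIST_DISCREPANCIES = "Audit List Discrepancies"
    AUDIT_LIST_AGREEMENT = "Audit List Agreement"
    RAID_STATUS_REPORT = "RAID Status Report"
    OSS_DATA_DUMP = "OSS Data Dump"


# Hypothetical assignment of the single trailing character to a type.
SUFFIX_TO_TYPE = {
    "A": AdminVolumeType.AUDIT_LIST,
    "D": AdminVolumeType.AUDIT_LIST_DISCREPANCIES,
    "G": AdminVolumeType.AUDIT_LIST_AGREEMENT,
    "R": AdminVolumeType.RAID_STATUS_REPORT,
    "X": AdminVolumeType.OSS_DATA_DUMP,
}


def classify_volser(volser: str) -> Optional[AdminVolumeType]:
    """Return the administrative volume type for a six-character volume
    serial number, or None if it names an ordinary volume."""
    if len(volser) != 6 or not volser.startswith(ADMIN_PREFIX):
        return None
    return SUFFIX_TO_TYPE.get(volser[5])
```

A mount handler on the OSS side could call `classify_volser` first and fall back to an ordinary virtual volume mount whenever it returns `None`.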

According to another aspect of the invention, the interface 
utilizes load display (LD) commands. LD commands are
channel commands normally used to route messages to a
tape drive's operator display accessory. The present invention
utilizes LD commands to communicate policy and
control information messages to the TDE system and to
monitor its condition. Policy information reflects decisions 
by the user(s) and includes rules and preferences for the 
handling and disposition of data written to VTDs. These user 
decisions are unknown to OSS, but are communicated to the 
LMS in the host, and guide LMS in its operation. For 
example, the assignment by LMS of particular data to a 
virtual volume belonging to a virtual volume set (VSET) 
with which a collection of pre-programmed handling rules is 
associated is an expression of policy. The virtual volumes 
need a large amount of policy information for management.

According to another aspect of the invention, the SBUS 
card slots are expanded to facilitate the use of a SPARC CPU 
to control a mass storage system. 



US 6,496,791 B1

According to another aspect, a special "Health Check" LD 
message is periodically sent. If a critical situation exists 
within OSS, a special error message will be generated and 
delivered to the operator by LMS. 
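
The health check exchange, as elaborated in the detailed description, can be modelled in a few lines. This is only a toy sketch: the class and method names are invented for illustration, sense data is represented as two named fields rather than real sense-byte offsets (which the source does not give), and the rule used to recognize the host's retry (the next command after a unit check) is an assumption.

```python
from dataclasses import dataclass
from typing import Optional

SUU_ERP_CODE = 0x42  # the distinctive "SUU ERP code" (X'42') from the description


@dataclass
class Sense:
    erp_code: int  # ERP code returned in the sense bytes
    ec_level: int  # "EC level" byte: OSS system error code (unsigned 8-bit)


class OssHealthModel:
    """Toy model of the OSS side of the healthCheck() poll (hypothetical)."""

    def __init__(self) -> None:
        # Highest-priority error currently outstanding, or None if healthy.
        self.error_code: Optional[int] = None
        # Assumption: the command following a unit check is the host's retry.
        self._retry_expected = False

    def load_display_health_check(self) -> Optional[Sense]:
        """Return None on normal completion, or sense data after a unit check."""
        if self.error_code is None:
            self._retry_expected = False
            return None
        if self._retry_expected:
            # The retried I/O is allowed to complete without error.
            self._retry_expected = False
            return None
        self._retry_expected = True
        return Sense(SUU_ERP_CODE, self.error_code & 0xFF)
```

In this model each regular poll made while an error is outstanding ends in a unit check, the immediate retry completes normally, and polls complete normally again once `error_code` is cleared.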
According to another aspect of the invention, SBUSs of
two SPARC CPUs are connected to each ESCON interface 
for redundancy. 

Additional features and advantages of the invention will 
be apparent in view of the following detailed description and 
appended drawings.

BRIEF DESCRIPTION OF THE DRAWINGS 

FIG. 1A is a block diagram of a preferred embodiment of
the invention;

FIG. 1B is a block diagram of a data storage system
including a TDE system; 

FIG. 2 is a flow-chart describing the steps of mounting an 
administrative volume; 
FIG. 3 is a perspective view of the channel interface
hardware; 

FIG. 4 is a block diagram of a DSB board and an ESCON
Interface daughter card;
FIG. 5 is a perspective view of the channel interface
hardware coupled to primary and alternate main processors; 
and 

FIG. 6 is a block diagram depicting redundant connec- 
tions to the CIFs. 

DETAILED DESCRIPTION OF THE
PREFERRED EMBODIMENTS

A preferred embodiment will now be described with 
reference to the figures, where like or similar elements are 
designated with the same reference numerals throughout the
several views. FIG. 1A is a high level block diagram of a part 
of a tape drive emulation (TDE) system 10, also referred to 
herein as OSS 10, utilizing an embodiment of the present 
invention. 

A plurality of channel interfaces (CIFs) 12 are coupled to
host I/O channels (not shown) to transfer data between the 
host and the TDE system. 

Each CIF 12 includes a host interface 14, an embedded
controller 16, a data formatter 18 for performing data
compression and other functions, an SBUS interface 22, a
buffer memory 20, and an internal bus 24. In the preferred
embodiment, the embedded controller 16 is a model i960
manufactured by Intel Corporation.

The main controller 30 includes a main processor 32,
main memory 34, an SBUS interface 36, and an internal bus
38. In the preferred embodiment, the main processor is a
SPARC computer manufactured by Sun Microsystems
Incorporated. The CIFs 12 and main controller 30 are
coupled by a system bus (SBus) 40.

The tape drive emulation (TDE) system 10 stores host 
data on "virtual tape drives." In one preferred embodiment, 
the data is actually stored on staging disks. Because the TDE 
system 10 must interact with the host as if the data were
actually stored on tape drives, a data structure called a
virtual tape drive (VTD) is maintained in main memory 34
for each virtual tape drive. Each VTD contains all information
about the state of the associated virtual tape drive.

FIG. 1B is a high-level block diagram of a system in
which a preferred embodiment of the invention is utilized. In
FIG. 1B, a host computer 50, for example an IBM mainframe
computer, executes a plurality of applications 52.
In practice, the host computer 50 is typically running the
MVS operating system manufactured by IBM. MVS provides
the applications with I/O services, including I/O to an
automatic tape library (ATL) 54. The physical interface
between the applications 52 and ESCON tape drives 55 is
the ESCON 3490 magnetic tape subsystem interface 55a.
MVS, the ESCON interface 55a, and the host computer 50
are well-known and not a part of the present invention.

The preferred embodiment of tape drive emulation (TDE)
system 10, designated OSS 10 (open systems server), is
manufactured by the assignee of the present invention. OSS
10 maintains virtual tape drives 56 (VTDs) which emulate
the physical ETDs 55. More details of the VTDs 56 will be
presented below. The interface between an application 52
and a VTD 56 is the OSS Emulated Device interface 57.

A library management system (LMS) software module 60
resides on the host 50 and provides services to MVS and
OSS. LMS 60 is responsible for management of the tape
library environment and performs such tasks as fetching and
loading cartridges into drives, returning unloaded cartridges
to their home locations, etc. The interface between LMS 60
and OSS 10 is the library manager interface, with paths 62a
and 62b based on two distinct protocols.

The VTD 56 is a non-physical device that responds as if
it were a real device. In the currently described embodiment,
the emulated physical device is an IBM-3490 tape drive. The
VTD 56 responds to commands issued on a channel in the
same fashion as the emulated technology.

Host data is stored in volumes. A virtual volume is a
collection of data and metadata that, taken together, emulate
a real tape volume. When "mounted" on a VTD, these
virtual volumes are indistinguishable from real tape volumes
by the host computer.

In this context "data" refers to data output by the host to
be stored on tape and "metadata" refers to information
generated by OSS which permits the emulation of real tape
drives and volumes.

An example will help clarify the meaning of the terms. If
a host application intends to write data to tape, it requests
that a tape be mounted on a tape drive. LMS intercepts the
request and causes a virtual volume to be mounted on a
virtual tape drive to receive the application output, which is
delivered by the ordinary tape output programs of the
operating system. Blocks of data received by OSS are
"packetized," the packets are grouped together in clusters
with a fixed maximum size, called "extents," and the extents
are written to staging disks. Often the extents containing
data from one virtual tape are scattered over several disk
drives. All information about the packetization, such as
packet grouping in extents and extent storage locations,
required to reassemble the volume for later use by the host
is metadata. Part of the metadata is stored with each extent
and part is stored on non-volatile storage in OSS, separate
from the extent storage.

LMS requires information concerning the contents of
OSS to properly respond to host requests for accessing
VTDs and virtual volumes. It also needs information on OSS
storage space usage to manage auxiliary operations which
maintain enough free space to adequately receive new
outputs. In the present embodiment, there are two primary
interfaces for transferring information between LMS and
OSS. In the discussion below, an interface means a protocol
or style of interaction between LMS and OSS, not necessarily
its physical implementation.

The first interface to be described is the administrative
volume interface, which is used to access a large volume of
information during a single transaction. This type of
information relates to complete and detailed status of the virtual
library in OSS and includes more information than can be
rapidly transferred using short messages. The high level
description of the function of the administrative volume
interface will now be presented with reference to FIG. 2.

When LMS requires information describing the status and
contents of OSS, it uses the conventional facilities of the
operating system to allocate a VTD and request the mounting
of an administrative volume. For example, it is necessary
to periodically synchronize status and content information of
OSS with information in the tape management component in
use by the operating system. Utilities in LMS implement this
synchronization.

A special naming convention for the administrative volumes
allows OSS to interpret the mount command as a
request for a particular body of status information. Different
names specify different types of administrative volumes.

These administrative volumes appear to host applications
as ordinary volumes with IBM standard labels. According to
a convention used in the preferred embodiment, their volume
serial numbers, the names by which they are known in
host indexes of tapes and which are written on the tapes in
the VOL1 labels, are generated by adding a single character
to a specified five character prefix, reserved to the exclusive
use of LMS.

In the currently described embodiment, there are five
types of administrative volumes defined: Audit List; Audit
List Discrepancies; Audit List Agreement; RAID Status
Report; and OSS Data Dump. Presently, only read-only
administrative volumes, i.e., administrative volumes transferring
information from OSS to LMS, are implemented.
Each volume contains labels and one data set.

The "Audit List" data set contains a volume status record
for each volume known to the OSS. This data set is a
read-only data set produced by the OSS when the appropriate
administrative volume is mounted.

The "Audit List Discrepancies" data set also contains
volume status records. It is a write-only data set and will
contain a volume status record for every volume on which
a host's tape management system component and OSS differ
on volume status.

The "Audit List Agreement" data set also contains volume
status records. It is a write-only data set, and will contain a
volume status record for every volume on which the tape
management system component and the OSS agree on
volume status.

The "RAID Status Report" data set contains a virtual
volume set (VSET) usage record for each VSET known to
OSS, a free space record for each RAID region and a single
OSS system status record. This data set is a read-only data
set produced by the OSS when the appropriate administrative
volume is mounted.

Information in the "OSS Data Dump" data set will contain
raw data that the OSS wishes to communicate to a host
application. This data set is a read-only data set produced by
the OSS when the appropriate administrative volume is
mounted.

In response to the host's mount request for an administrative
volume containing a read-only data set, OSS builds
an administrative data set from status information stored in
its data base. The type of information included in the
administrative data set depends on the name (the volume
serial number) included in the mount command.

When OSS has completed building the administrative
data set and storing it on disks in the usual way for storage
of OSS virtual volumes, it signals LMS that the administrative
volume is mounted. LMS then reads the administrative
data set to obtain the requested status information.

Thus, standard channel commands are utilized to transfer
status information between OSS and LMS in an efficient
manner.

The above-described administrative volume interface provides
a unique mechanism to transfer large volumes of
status, control, and configuration information between OSS
and LMS. Such information is required in a TDE system to
permit effective management and operation of the TDE
system by the LMS.

Another type of information unique to a TDE system is
termed "policy" information. In one type of OSS, virtual
volume data transferred between the host is staged on disk
drives. Additionally, tape drives (the SCSI tape drives of
FIG. 1) are also included and virtual volumes may be
destaged from the disk drives to the tape drives.

The majority of optimizations available to tailor an OSS
to a particular customer's requirements come from optimization
of the timing of various events in the life-cycle of a virtual
volume. These optimizations take the form of choices
among various policies defined below.

The task of managing policy decisions is simplified by
grouping attributes of virtual volumes as follows:
  those attributes which specify the kind of medium being
    emulated;
  those attributes which guide the choice of long term
    storage media for data associated with the virtual
    volume; and
  those attributes that direct the timing of data residency as
    it passes through the OSS staging disks.

Several examples of policies associated with a virtual
volume are the performance class, media class, and storage
class.

The performance class specifies the attributes of a virtual
volume that govern residency timing of the data of the
virtual volume on various media. It is called performance
class because altering the class changes user-perceived
performance, most notably mount delay.

The media class describes the attributes of a single kind
of media emulated by OSS, i.e., attributes such as technology
type, media geometry, etc. Media classes are defined by
the user and associated with a virtual volume at the time of
its creation. An example of media class might be "3490-CST,"
defined as 550 foot, 3490 EDRC tapes.

The storage class is a description of whether data is
replicated, and how data is stored in OSS. A storage class is
associated with a virtual volume at the time of its creation,
and may be changed at any time that a volume is mounted
as an output-only volume. Storage classes are defined by the
user. An example of a storage class might be "VAULTED"
defined to direct the data for a virtual volume to a single stacked
image and single native image, with the intent that the native
image will be stored offsite.

As these examples show, policy information must be
communicated to OSS in real time, for example when a
virtual volume is created or mounted. In the present
embodiment, policy information is communicated by unconventional
use of the standard load display interface (LDI).

"Load Display" is a command issued to a 3490 tape drive
which transmits an associated message, part of which is to
be displayed on a display pod associated with the tape drive.
In an IBM 3490, the messages do not affect the operation of
the tape drive, but were intended to communicate with a
human operator. With the advent of automatic tape libraries
the LDI has become a vestigial communication channel. In
the present invention, the LDI is used to communicate policy
information to the OSS from the LMS in real time.

A load display command transfers sixteen bytes of
information, viewed as a message. The format utilized in the
currently described embodiment includes a first byte set to
identify the message as an LMS to OSS communication, a
byte to identify a particular request, thirteen bytes of
request specific data, a check sum byte (of all preceding
bytes), and a byte specifying the data length. The LDI
messages are not intended to be displayed.

In the currently described embodiment, the LMS library
driver uses the LDI to request these services:
mountVirtualScratch( ); MountVirtualVolume( );
keepVirtualVolume( ); updateVolumeStatus( ); healthCheck( );
timeStamp( ); stageVirtualVolume( );
destageVirtualVolume( ); and reuseSpace( ).

The mountVirtualScratch( ) request, for example, specifies
a VSET Name in thirteen bytes of the message.
Responding to the request, OSS mounts a volume it chooses
from a pool of volumes whose names are within a name range
allocated to the VSET and whose status is
"scratch" (meaning not in use, containing only a label). The
volume thus mounted takes on the media, performance and
storage class attributes associated with the VSET as defaults.
Subsequent LDI requests naming the chosen volume may be
used to alter certain of the attributes.

The LDI is, by its nature, a one way communication
channel: it was designed for the host computer to send
display messages to operators. However, one type of LDI
message supported by the present embodiment, the health
check, is an example of the use of the LDI for information
gathering rather than expressing policy or requesting action.
The healthCheck( ) message format is sent from LMS as a
poll request to determine the operational status of the OSS.
In critical situations the OSS must inform LMS and/or the
operator that an event has occurred. These events are currently
associated with RAID space shortage and equipment
failures.

The LDI healthCheck( ) poll message is issued to the OSS
on a regular basis, e.g., every 30 seconds. The healthCheck( )
message contains a host system identifier and the date and
time. If an error condition exists in OSS, the OSS ends the
execution of the load display command with command-failure
status (unit check). The host reacts to unit check by
reading sense bytes from the OSS and OSS includes in these
a distinctive "SUU ERP code" (X'42'). The "EC level" byte
in the sense data contains the OSS system error code, an
unsigned, 8-bit number indicating the highest priority error
existing at the time. The OSS continues to respond to
subsequent healthCheck( ) polls until the malfunction or
other emergency is resolved (or goes away).

The MVS operating system produces a LOGREC record
and an IOS000I message for each SUU response but retries
the failing I/O. OSS, recognizing the retry, allows the
command to complete without error. LMS intercepts the
MVS IOS000I message, deletes it, informs the operator of
the error condition, and then takes appropriate action.

An additional requirement of OSS is to provide multiple
channel interfaces to a host or multiple hosts. The controller
utilized in the currently preferred embodiment is based on a
SPARC processor manufactured by SUN Microsystems.

A special interface has been designed to expand the
standard interface provided with the SPARC computer.
Additionally, redundancy is built into the interface to assure
the reliability required of a tape storage system.

FIG. 3 is a high-level schematic diagram showing the
redundant, channel interface expansion chassis hardware
300. The CIF interface cards 12 reside in the expansion
chassis 300. The chassis is divided into two halves 302 and 
304. Electrically, each half is separately powered and con- 
tains its own separate sets of SBus connections. Each half 
can contain up to four dual SBus base (DSB) boards 306, 
each of which can contain up to four channel interfaces 12 
(in this embodiment ESCON interface (EI) or block multi- 
plexer interface (BMUX) daughter cards) for a total of 
sixteen interfaces per half. Unless otherwise specified, the 
term "chassis" in the following refers to a chassis half. 

Each connection to the main processor(s) 320 is made via
an SBus expander, consisting of an SBus adapter (SSA) 340
that plugs into an SBus slot inside the SPARC, an interconnecting
cable 342, and an SBus expander (SSE) 360, which
connects to either two or four slots in the chassis. Each slot
in the chassis can connect to two SSEs 360 via identical 
connectors. If each SSE is connected to four slots, then two 
separate SBus connections to SPARC(s) are provided for 
that chassis. If each SSE is connected to two slots, then each 
pair of slots contains connections to two SBuses, for a total 
of four separate SBuses per chassis [half], and a system total 
of eight SBus connections to SPARCs. 
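
The connection counts above follow from simple arithmetic, sketched here under the assumption that every slot in a chassis half has both of its connectors wired to an SSE. The function and constant names are illustrative only.

```python
SLOTS_PER_CHASSIS_HALF = 4  # each half holds up to four DSB boards
CONNECTORS_PER_SLOT = 2     # each slot can connect to two SSEs
CHASSIS_HALVES = 2


def sbus_connections_per_half(slots_per_sse: int) -> int:
    """Number of separate SBus connections serving one chassis half,
    given how many slots each SSE is wired to (two or four)."""
    if slots_per_sse not in (2, 4):
        raise ValueError("an SSE connects to either two or four slots")
    # Total slot-connector attachments that must be covered by SSEs.
    attachments = SLOTS_PER_CHASSIS_HALF * CONNECTORS_PER_SLOT
    return attachments // slots_per_sse


def sbus_connections_total(slots_per_sse: int) -> int:
    """System-wide count across both chassis halves."""
    return CHASSIS_HALVES * sbus_connections_per_half(slots_per_sse)
```

With four slots per SSE the function gives two SBus connections per half; with two slots per SSE it gives four per half and eight for the system, matching the figures in the text.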

FIG. 4 is a simplified block diagram of the DSB and CIF 
daughter cards shown in FIG. 3. A DSB 306 plugs into a slot 
in the chassis, and therefore connects to two SSEs 360, 
which are referred to as SBus0 and SBus1. There are two SI64
bridge chips on the DSB, one for each SBus. 

The SI64s 370 are connected to the [up to] four interface 
daughter cards on the DSB via a common i960-compatible 
bus 372. Also included on the DSB are some shared control/ 
status/interrupt support registers used for communication 
between the SPARC and the i960s. 

Thus, each DSB card 306 contains two SI-64's, one for
connection to each of two SPARC SBuses. On the i960 side,
the two SI-64's connect to a common bus. Each EI (ESCON
Interface, which contains an i960) board 12 carried by the
DSB 306 has access to this common bus according to a
scheme of arbitration supported by a small amount of DSB
logic. This connection uses a set of transceivers to temporarily
link an i960's local bus and the common bus 372,
which have the same design. Thus, each i960 can access and
use either SI-64. 

The SI-64 is a Motorola product which operates as a 
"gateway" between the main bus of an i960 (call this the 
"local" bus) and the I/O bus (specifically SBus) of a SPARC 
machine. It can be programmed to perform direct memory 
access (DMA) transfers between memories of the two 
computers. It is therefore able to arbitrate accesses to each 
kind of bus and has address and length counters to effect one 
transfer at a time. Without using the DMA feature, the chip 
allows an i960 program to read from or write to an arbitrary 
SPARC memory location and allows a SPARC program to 
read from or write to an arbitrary i960 memory location. 

The SPARC system utilizes virtual addresses, which are
translated into physical addresses on the SBus, for devices
on the SBus.

Access to the interface cards from the SBus (SPARC) is 
accomplished by dividing twenty-eight bit physical 
addresses allotted to the chassis ("SBus Slot") into four 
equal 64 MByte areas that select one of the four DSB boards 
in the chassis. The 64 MByte area of each DSB board is 
sub-divided into four areas of 16 MB which select one of the 
interface daughter cards 12. The 16 MB area assigned to
each interface card is further sub-divided into an 8 MB area
that allows direct access into the interface card. The remaining
8 MB area is mapped into a common area that allows
access to shared resources on the DSB such as dual-ported
RAM, bridge registers, interrupt status, etc.
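
The address subdivision just described amounts to a fixed bit-field decode. The following sketch assumes that the areas are laid out contiguously in ascending order and that the directly accessible 8 MB is the lower half of each card's 16 MB area; the source states the sizes but not the ordering, and all names are hypothetical.

```python
from typing import NamedTuple

MB = 1 << 20
SLOT_SPACE = 1 << 28   # twenty-eight bit physical address space (256 MB)
DSB_AREA = 64 * MB     # one of four areas, selects a DSB board
CARD_AREA = 16 * MB    # one of four areas, selects an interface card
HALF_AREA = 8 * MB     # direct-access half or shared-resource half


class Decoded(NamedTuple):
    dsb: int      # DSB board index, 0..3
    card: int     # interface daughter card index, 0..3
    shared: bool  # True if the address falls in the shared DSB area
    offset: int   # byte offset within the selected 8 MB area


def decode_sbus_slot_address(addr: int) -> Decoded:
    """Split a 28-bit SBus slot address into board, card, area and offset."""
    if not 0 <= addr < SLOT_SPACE:
        raise ValueError("address must fit in 28 bits")
    dsb = addr // DSB_AREA
    card = (addr % DSB_AREA) // CARD_AREA
    within_card = addr % CARD_AREA
    # Assumption: lower 8 MB is direct card access, upper 8 MB is shared.
    shared = within_card >= HALF_AREA
    return Decoded(dsb, card, shared, within_card % HALF_AREA)
```

Because every area size is a power of two, the same decode could equally be written with shifts and masks (bits 27..26 for the board, 25..24 for the card, bit 23 for the area) under the same layout assumption.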

As depicted in FIG. 5, redundancy is provided by connecting
a backup main processor 320(2) to a third SSE
360(3), coupled to the same DSBs 380 as the first SSE
360(1). Thus, it is possible for the backup main processor 
320(2) to take over the functions of the primary main 
processor 320(1) in the event of a failure. 

The redundant connection to the SBuses of the primary and
backup main processors is depicted in FIG. 6. Each DSB
contains up to four CIFs 12 (in this figure designated
ESCON Interface Daughter Cards (EIs)).

The invention has now been described with reference to
the preferred embodiments. Alternatives and substitutions
will now be apparent to persons of skill in the art. For
example, particular products such as a SPARC processor and
SI64 interface chips have been described. Other products
may be substituted. Accordingly, it is not intended to limit
the invention, except as provided by the appended claims.
What is claimed is: 

1. A tape drive emulation interface, redundantly connecting
a plurality of host channels to a main processor and a
backup processor, each processor having a systembus of a
tape drive emulation system, said interface comprising:
a chassis having a plurality of slots, with each slot having
at least first and second connectors;
a plurality of dual systembus base boards, each coupled to
one of said slots;
a plurality of channel interfaces disposed on each dual
systembus base board, each channel interface for interfacing
a host channel to input/output buffers of the main
processor;
a first bus expander, coupled to the first connector of said
slots, for multiplexing said plurality of dual systembus
base boards to a first bus expander port;
a second bus expander, coupled to the second connector
of said slots, for multiplexing said plurality of dual
systembus base boards to a second bus expander port;
a first set of signal lines coupling said first bus expander
port to said main processor systembus; and
a second set of signal lines coupling said second bus
expander port to said backup processor systembus.

2. The tape drive emulation interface of claim 1 wherein 
at least one of the plurality of host channels couples to a host 
comprising a library management system and a plurality of 
applications. 

3. The tape drive emulation interface of claim 1 wherein
at least one of the plurality of channel interfaces comprises:
an embedded controller;
a data formatter; and
a memory coupled to the embedded controller and data
formatter by an internal bus.

4. The tape drive emulation interface of claim 3 wherein
the data formatter is configured to compress data received
from a host channel.

5. The tape drive emulation interface of claim 3 wherein
the data formatter is configured to compress data received
from a host channel and decompress data sent to the host
channel.

6. The tape drive emulation interface of claim 5 wherein
four channel interfaces are disposed on each of the plurality 
of dual systembus base boards. 

7. The tape drive emulation interface of claim 1 wherein 
at least one of the plurality of dual systembus base boards 
comprises: 

a first bridge circuit coupled between the first connector 
and the plurality of channel interfaces; and 
a second bridge circuit coupled between the second
connector and the plurality of channel interfaces.

8. The tape drive emulation interface of claim 1 wherein
the first set of signal lines comprises a first expander cable,
and the second set of signal lines comprises a second
expander cable.

***** 